Revisiting Taxonomy Induction over Wikipedia
Guided by multiple heuristics, a unified taxonomy of entities and categories is distilled from the Wikipedia category network. A comprehensive evaluation, based on the analysis of upward generalization paths, demonstrates that the taxonomy supports generalizations which are more than twice as accurate as the state of the art. The taxonomy is available at http://headstaxonomy.com
GIANT: Scalable Creation of a Web-scale Ontology
Understanding what online users may pay attention to is key to content
recommendation and search services. These services will benefit from a highly
structured and web-scale ontology of entities, concepts, events, topics and
categories. While existing knowledge bases and taxonomies embody a large volume
of entities and categories, we argue that they fail to discover properly
grained concepts, events and topics in the language style of online population.
Neither is a logically structured ontology maintained among these notions. In
this paper, we present GIANT, a mechanism to construct a user-centered,
web-scale, structured ontology, containing a large number of natural language
phrases conforming to user attentions at various granularities, mined from a
vast volume of web documents and search click graphs. Various types of edges
are also constructed to maintain a hierarchy in the ontology. We present our
graph-neural-network-based techniques used in GIANT, and evaluate the proposed
methods as compared to a variety of baselines. GIANT has produced the Attention
Ontology, which has been deployed in various Tencent applications involving
over a billion users. Online A/B testing performed on Tencent QQ Browser shows
that Attention Ontology can significantly improve click-through rates in news
recommendation. Comment: Accepted as full paper by SIGMOD 202
Modeling, Indexing and Retrieving Images using Conceptual Graphs
When dealing with the complexity of an image as part of the indexing process, keywords are not sufficient to obtain an index that is a faithful representation of the image content. We propose to use the conceptual graphs formalism as the indexing language, which allows the use of not only keywords but also relations between them. The obtained indexes are more precise, and retrieval effectiveness is thus improved. Our paper presents a system that provides a computer-assisted image indexing process, which is performed according to a formal image model. The result of the indexing process, which is a set of conceptual graphs, is then organized so as to improve retrieval execution times. Our image retrieval system, called RELIEF, is implemented on an object-oriented DBMS and is available on the Web. It ensures the management of an image test collection and gives good results, with respect to both execution time and quality of answers.
Finding the Best Parameters for Image Ranking: a User-Oriented Approach
Image ranking is a task that involves different parameters. They depend on the intrinsic characteristics of an image, but also on the indexing language used for representing its semantic content. We developed a weighting model that combines these parameters in a general scheme. Finding the best balance between the parameters is not straightforward. Different parameter combinations lead to different rankings, which may be more or less accepted by the users. In this paper, we choose a set of test queries and present the impact of the parameters on the rank of each image. Different combinations are discussed, and the best combination is specified. For the evaluation, we follow a user-oriented approach, and compare the ranking provided by each parameter combination to the ranking given by human judgment. This is a step toward a user-centered image retrieval system, which will dynamically adapt to the user's profile and preferences.
RELIEF: Combining expressiveness and rapidity into a single system
This paper constitutes a proposal for an efficient and effective logical information retrieval system. Following a relational indexing approach, which is in our opinion a necessity to cope with emerging applications such as those based on multimedia, we use the conceptual graphs formalism as our indexing language. This choice allows for relational indexing support and captures all the useful properties of the logical information retrieval model, in a workable system. First-order logic and standard information retrieval techniques are combined to the same effect: obtaining an expressive system, able to accurately handle complex documents, improve retrieval effectiveness, and achieve good time performance. Experiments on an image test collection, within a system available on the Web, provide an illustration of the role that logic may have in the future development of information retrieval systems.
The RELIEF Retrieval System
We introduce in this paper the RELIEF retrieval system, a system for image retrieval based on the conceptual graph formalism. Conceptual graphs can be used as a simple and expressive language for indexing and retrieving non-textual documents. In this formalism the implementation of the matching function between a query and a document is obtained by using the so-called projection operator between two conceptual graphs. However, the first implementations of this operator have shown its lack of efficiency, as it is based on an exponential algorithm. The RELIEF system supports a new polynomial matching function for conceptual graphs which turns out to be equivalent to the projection operator but has the important feature that it incorporates reasoning about relations. This allows for better qualitative results as well as improved time performance compared to the original system implementing the classical projection. RELIEF is developed on top of the object-oriented DBMS O2, and any ..
High Performance Question/Answering
In this paper we present the features of a Question/Answering (Q/A) system that had unparalleled performance in the TREC-9 evaluations. We explain the accuracy of our system through the unique characteristics of its architecture: (1) usage of a wide-coverage answer type taxonomy; (2) repeated passage retrieval; (3) lexico-semantic feedback loops; (4) extraction of the answers based on machine learning techniques; and (5) answer caching. Experimental results show the effects of each feature on the overall performance of the Q/A system and lead to general conclusions about Q/A from large text collections.